Skip to content

test(flatkv): Expands docker-level FlatKV coverage #3417

Open
blindchaser wants to merge 9 commits into
mainfrom
yiren/docker-coverage
Open

test(flatkv): Expands docker-level FlatKV coverage #3417
blindchaser wants to merge 9 commits into
mainfrom
yiren/docker-coverage

Conversation

@blindchaser
Copy link
Copy Markdown
Contributor

@blindchaser blindchaser commented May 11, 2026

Summary

Expands docker-level FlatKV coverage across lattice-enabled, post-import, crash, state-sync, and local data-loss recovery paths. Adds an offline seidb import-flatkv-from-memiavl path that rebuilds FlatKV EVM state from memiavl, hardens the FlatKV importer against partial finalization, and he import command is a DR / test-seeding primitive; it preserves the production MigrationManager rollout boundary instead of replacing it.

  • sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl.go: introduces import-flatkv-from-memiavl. It accepts only evm, rejects stale or future heights relative to memiavl.GetLatestVersion, refuses to overwrite committed FlatKV without --force, passes source module names to AddModule, polls importer errors between exporter reads, and aborts on failure so partial imports never finalize or write a snapshot.
  • sei-db/tools/cmd/seidb/main.go: registers the import command under seidb.
  • sei-db/state_db/sc/flatkv/import_translator.go: adds ImportTranslator, which maps memiavl changesets into FlatKV physical rows. It routes storage/code/legacy/non-EVM keys, buffers nonce/codehash account fragments across batches, emits one merged account row per address on Finalize, and returns ErrImportTranslatorFinalized after finalization.
  • sei-db/state_db/sc/flatkv/importer.go: makes KVImporter.Close idempotent, adds Abort(reason) to drain without FinalizeImport / WriteSnapshot, exposes Err() for fail-fast callers, and stores the test flush hook through atomic.Pointer to avoid future race-detector failures.
  • sei-db/state_db/sc/flatkv/wal_torn_write_test.go: adds flatkv-level WAL torn-write recovery coverage. It proves corrupted trailing WAL entries are discarded, clean WAL tails replay to the last appended commit, and recovered LtHash matches the last durable state.
  • integration_test/contracts/deploy_flatkv_evm_fixture.sh: deploys an EVM fixture and persists expected balance, storage, code, and missing-key observations for pre/post import RPC checks.
  • integration_test/contracts/import_flatkv_evm_cluster.sh: stops the devnet, imports EVM state into FlatKV on every validator, restarts in post-import dual_write with evm-ss-split=false and sc-enable-lattice-hash=false, and gates subsequent checks on block progress plus EVM RPC readiness.
  • integration_test/contracts/verify_cross_validator_flatkv_digest.sh: asserts sc-enable-lattice-hash=true in the from-genesis GIGA path and compares all FlatKV buckets (account, code, storage, legacy) at a shared committed height using dump-flatkv --height WAL replay.
  • integration_test/contracts/verify_statesync_flatkv_digest.sh: asserts lattice hash is enabled and compares donor vs state-sync receiver full-bucket FlatKV digests at a shared committed height.
  • integration_test/contracts/verify_flatkv_crash_recovery.sh: SIGKILLs one validator, verifies survivor progress, restarts the process, waits for catch-up, and compares canonical EVM buckets (account, code, storage) in the post-import lattice-disabled path.
  • integration_test/contracts/verify_flatkv_statesync_crash_recovery.sh: wipes one validator, starts state-sync, injects a SIGKILL during restore, restarts, waits for catch-up, and verifies canonical EVM bucket parity.
  • integration_test/contracts/verify_flatkv_total_loss_recovery.sh: simulates total local state loss, restores the validator via state-sync while preserving validator signing state, and verifies canonical EVM bucket parity.
  • integration_test/contracts/verify_flatkv_partial_loss_fails_loudly.sh: deletes only the FlatKV directory and requires either a clear startup failure or a fully self-healed canonical EVM digest match.
  • integration_test/contracts/verify_flatkv_evm_store.sh: dumps the storage bucket, verifies the fixture contract row, and documents the expected 41-byte serialized storage value layout.
  • integration_test/seidb/flatkv_evm_test.yaml: verifies historical and latest EVM balance, storage, and code reads before and after import, checks missing account/storage behavior, and confirms non-EVM modules remain queryable.
  • integration_test/contracts/AGENTS.md: documents the shared FlatKV Integration CI row, step ordering, destructive-test self-healing requirements, and the distinction between lattice-enabled full-bucket checks and post-import canonical EVM checks.
  • .github/workflows/integration-test.yml: keeps FlatKV-specific docker coverage under one FlatKV Integration row and appends crash, state-sync crash, total-loss, and partial-loss scenarios after import verification. It also runs full-bucket lattice-enabled digest checks inside Chain Operation Test.
  • sei-db/state_db/sc/migration/OPERATIONS.md: documents the MigrateEVM operational model, non-atomic memiavl/FlatKV write risks, failure modes, detection signals, recovery paths, and future tooling roadmap.

Test plan

  • sei-db/state_db/sc/flatkv/importer_test.go: covers Close idempotency on success and error, first-error-wins semantics, non-blocking AddNode after done, multi-flush large imports, abort-without-finalize behavior, abort after close, and explicit backpressure through the atomic flush hook.
  • sei-db/state_db/sc/flatkv/import_translator_test.go: covers nil/empty changesets, storage/code/legacy/non-EVM routing, nonce and codehash buffering, same-call and cross-call account merges, delete dropping, malformed key/value rejection, mixed storage/account batches, Finalize idempotency, and the Translate after Finalize sentinel error.
  • sei-db/state_db/sc/flatkv/wal_torn_write_test.go: covers mid-record truncation, partial length-prefix truncation, and clean WAL tails, asserting recovery to the last good version or replay to the last clean WAL commit with LtHash consistency.
  • sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl_test.go: covers end-to-end EVM encoding through the CLI path, --force overwrite safety, height range checks, stale-height and future-height rejection, untouched FlatKV on rejection, and a large dataset that exercises importer flushes, worker backpressure, and cross-batch account merging.
  • Docker integration: Chain Operation Test verifies sc-enable-lattice-hash=true, state-sync donor/receiver full-bucket digest equality, and cross-validator full-bucket digest equality at committed heights. FlatKV Integration verifies fixture RPC reads before and after offline import, physical storage layout, SIGKILL restart recovery, state-sync mid-flight crash recovery, total-loss state-sync recovery, and FlatKV-only partial-loss failure behavior with diagnostic log dumps on failure.

Note

Medium Risk
Adds substantial new integration coverage and introduces/changes FlatKV import tooling and importer lifecycle semantics; failures could affect state migration/recovery workflows and test stability rather than runtime consensus logic.

Overview
Expands docker CI coverage for FlatKV by adding a consolidated FlatKV Integration matrix job plus new scripts that (1) deploy an EVM fixture, (2) run an offline memiavl→FlatKV import across validators, and (3) validate post-import correctness via EVM historical/latest RPC checks, on-disk dump-flatkv assertions, cross-node digest comparisons, and destructive recovery scenarios (SIGKILL crash recovery, state-sync crash recovery, total state loss recovery, and FlatKV-only loss behavior). Chain Operation Test also now verifies state-sync correctness by diffing FlatKV digests.

Adds and hardens FlatKV import tooling: introduces seidb import-flatkv-from-memiavl (evm-only, height safety checks, --force overwrite guard, fail-fast error polling) and a companion memiavl-latest-version helper for orchestrated multi-node imports. Under the hood it adds a new ImportTranslator to translate memiavl changesets into FlatKV physical rows (buffering and finalizing account writes), and updates KVImporter to support Err() reporting, idempotent Close(), and Abort() to prevent partial imports from finalizing/writing snapshots.

Improves test coverage with new unit tests for the translator, importer concurrency/lifecycle/backpressure/abort semantics, and FlatKV WAL torn-write recovery, plus an operational doc (sei-db/state_db/sc/migration/OPERATIONS.md) describing migration failure modes and recovery tooling roadmap.

Reviewed by Cursor Bugbot for commit 867afb8. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 11, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedMay 13, 2026, 9:46 PM

@codecov
Copy link
Copy Markdown

codecov Bot commented May 11, 2026

Codecov Report

❌ Patch coverage is 56.87500% with 138 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.24%. Comparing base (26636ba) to head (867afb8).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...cmd/seidb/operations/import_flatkv_from_memiavl.go 41.74% 102 Missing and 18 partials ⚠️
sei-db/state_db/sc/flatkv/import_translator.go 84.84% 5 Missing and 5 partials ⚠️
sei-db/state_db/sc/flatkv/importer.go 86.95% 4 Missing and 2 partials ⚠️
sei-db/tools/cmd/seidb/main.go 0.00% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3417      +/-   ##
==========================================
- Coverage   59.34%   59.24%   -0.11%     
==========================================
  Files        2112     2112              
  Lines      174772   174431     -341     
==========================================
- Hits       103724   103338     -386     
- Misses      62010    62143     +133     
+ Partials     9038     8950      -88     
Flag Coverage Δ
sei-chain-pr 55.95% <56.87%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/tools/cmd/seidb/main.go 0.00% <0.00%> (ø)
sei-db/state_db/sc/flatkv/importer.go 90.12% <86.95%> (+2.28%) ⬆️
sei-db/state_db/sc/flatkv/import_translator.go 84.84% <84.84%> (ø)
...cmd/seidb/operations/import_flatkv_from_memiavl.go 41.74% <41.74%> (ø)

... and 52 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00487eff98

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl.go
…verage

`seidb import-flatkv-from-memiavl` copies a memiavl module (evm only) into
FlatKV at a given height. Adds the CLI + translator, KVImporter
Abort/Close-idempotent for partial-commit safety, unit tests for the
concurrency pipeline + a 50K-pair E2E, and a docker CI job asserting
post-import RPC reads match pre-import values.

Offline DR / test-seeding tool only; does NOT exercise the MigrationManager
(V0 -> V1) production path, and the post-import smoke is RPC-sample-level
not a full set comparison.

Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread integration_test/contracts/verify_statesync_flatkv_digest.sh Outdated
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from 10c93b3 to 0414b90 Compare May 11, 2026 15:15
…ash recovery

- verify_statesync_flatkv_digest.sh: donor↔rpc-node dump-flatkv sha256 match
  at a chain height both nodes have committed, replacing the existing
  height-only state-sync probe.
- verify_cross_validator_flatkv_digest.sh: all 4 validators must agree on
  flatkv physical content at a common committed chain height, catching
  silent drift the AppHash path cannot.
- verify_flatkv_crash_recovery.sh: SIGKILL a validator mid block production,
  restart it, wait for catch-up, then 4-way digest match at a common
  committed chain height.
- Wire the first two into the GIGA "Chain Operation Test" job; add a new
  GIGA "FlatKV Crash Recovery" job for the SIGKILL test.

Compares via chain height + dump-flatkv WAL-replay (which works regardless
of flatkv SnapshotInterval) rather than intersecting flatkv snapshot dirs,
so the checks remain meaningful on CI devnets that never reach the default
SnapshotInterval=10000 block boundary.
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from 0414b90 to f46dc25 Compare May 11, 2026 15:46
Comment thread integration_test/contracts/verify_cross_validator_flatkv_digest.sh
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch 2 times, most recently from ab3edcd to f89deef Compare May 12, 2026 05:29
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from f89deef to c59c4dc Compare May 12, 2026 05:39
@blindchaser blindchaser changed the title test(flatkv): increase docker ci coverage test(flatkv): Expands docker-level FlatKV coverage May 12, 2026
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice!

Comment thread integration_test/contracts/verify_flatkv_statesync_crash_recovery.sh Outdated
Comment thread sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl.go
echo "Importing evm module from memiavl into FlatKV on all validators..."
for i in $(seq 0 $((NODE_COUNT - 1))); do
docker exec "sei-node-$i" bash -lc "cd /sei-protocol/sei-chain && build/seidb import-flatkv-from-memiavl --modules=evm --data-dir /root/.sei/data"
done
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seidb binary built on one node, import runs on all

Medium Severity

The seidb binary is compiled only inside sei-node-0 (line 70), but the import loop on lines 95–97 executes build/seidb inside every validator container. All other verification scripts in this PR use an ensure_seidb helper that checks and builds the binary on each node independently. If the build output directory is not a shared Docker volume, sei-node-1 through sei-node-3 would fail with a missing binary.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 901f80b. Configure here.

Comment thread integration_test/contracts/verify_flatkv_statesync_crash_recovery.sh Outdated
}
imp.setErr(reason)
return imp.Close()
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abort after successful Close corrupts Err state

Low Severity

Abort unconditionally calls setErr(reason) before delegating to Close. If Close already completed successfully (with firstErr still nil), the CompareAndSwap in setErr succeeds, poisoning firstErr with the abort reason. After that, Err() returns the abort reason while Close() returns the cached nil — the two disagree. While the current CLI tool never hits this path because the defer only calls Abort when err != nil, any future caller querying Err() after a successful Close + stale Abort would see a spurious error.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 52870bd. Configure here.

@blindchaser blindchaser force-pushed the yiren/docker-coverage branch 3 times, most recently from 66cc74b to 49ecf72 Compare May 13, 2026 03:04
"grep -Eiq 'flatkv|version|missing|LoadVersion|reconcile|state_commit' /sei-protocol/sei-chain/build/generated/logs/seid-${VICTIM_INDEX}.log"; then
echo "PASS: $VICTIM_NODE failed loudly after FlatKV-only loss"
exit 0
fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overly broad grep masks unrelated crash as PASS

Medium Severity

The grep -Eiq pattern includes the term version, which is so broad it matches virtually any seid log file (e.g. "app version", "tendermint version" lines printed at startup). This means if the node crashes after FlatKV deletion for an unrelated reason (OOM, config error, unrelated panic), the grep still matches and the test reports PASS: failed loudly after FlatKV-only loss. The test's stated contract — that the node must detect and report the storage failure — is not actually enforced. A more specific pattern (e.g. requiring flatkv or state_commit without the catch-all version) would prevent the false positive.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 49ecf72. Configure here.

@blindchaser blindchaser force-pushed the yiren/docker-coverage branch 3 times, most recently from a200f76 to c417fdf Compare May 13, 2026 14:29
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 5 total unresolved issues (including 4 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit c417fdf. Configure here.

Comment thread integration_test/contracts/verify_flatkv_total_loss_recovery.sh
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch 3 times, most recently from 20fce20 to c7b91e7 Compare May 13, 2026 16:56
Co-authored-by: Cursor <cursoragent@cursor.com>
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from c7b91e7 to d1c8779 Compare May 13, 2026 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants